The conditions of the chromosomes inside the nucleus in the Rabl configuration have
been modelled as self-avoiding polymer chains under restraining conditions. To ensure
that the chromosomes remain stretched out and lined up, we fixed their end points to
two opposing walls. The number of segments $N$, the distances $d_1$ and $d_2$ between the
fixpoints, and the wall-to-wall distance $z$ (as measured in segment lengths) determine an
approximate value for the Kuhn segment length $l_K$. We have simulated the movement of
the chromosomes using molecular dynamics to obtain the expected distance distribution
between the genetic loci in the absence of further attractive or repulsive forces. A
comparison to biological experiments on Drosophila melanogaster yields information on the
parameters for our model. With the correct parameters it is possible to draw conclusions
on the strength and range of the attraction that leads to pairing.

Keywords: biomolecules, structure and physical properties, gene expression, molecular

1. Introduction

The process of gene silencing is a crucial building block in the picture that geneticists
have developed during the last decade of how genes function in all kinds of organisms.
Gene silencing is a highly complex area of research, and several mechanisms have been
identified that inhibit gene expression within the nucleus.

The simplest molecular model to explain gene silencing postulates that specific
repressors regulate the onset of transcription by binding directly to specific DNA
sequences and counteracting the action of activators and of the transcriptional
machinery. A second possibility is that repressors, bound at specific sequences called
silencing elements, might act by regulating the structure of the folded state of the
DNA, called chromatin. In the cell nucleus, DNA is wrapped around histones to
form the fundamental chromatin unit, called the nucleosome, and adjacent nucleosomes
are able to fold into a higher-order chromatin fiber. These structures reduce the
accessibility of DNA to the transcriptional machinery, and repressors might prevent
transcription by stabilizing the binding of histones to DNA, or the folding of
nucleosomes into compact higher-order chromatin structures.

Besides these levels of regulation, another level exists, namely the three-dimensional
organization of chromosomal domains in the cell nucleus during cellular
differentiation and development. In several cases, gene silencing has been correlated
with relocation of chromosomal domains. In most of the published studies,
gene silencing correlated with gene positioning close to heterochromatic compartments.
Heterochromatin represents a highly compact region of chromatin where
genes are stably repressed.

Another case of gene silencing that shares common features with heterochromatin
silencing involves the proteins of the Polycomb group (PcG). PcG proteins
are highly conserved regulatory factors that are responsible for the maintenance
of the silent state of important developmental genes, such as homeotic genes. In
Drosophila melanogaster, PcG proteins form multimeric complexes and regulate
their genes through binding to chromosomal regulatory elements named PcG response
elements (PREs). This silencing involves repressive modifications on the
target chromatin. In addition, it has been observed that silencing via PcG proteins
and PREs is enhanced by the presence of multiple copies of PRE-containing
elements in the nucleus. These copies may, but do not have to, be on the same
chromosome. Long-distance pairing between two such loci, which brings them closer
together than they would usually be, leads to strong repression of the genes they
control (Bantignies et al.). This type of regulation represents silencing by geometrical
closeness, established in interphase nuclei (see Fig. 1).

In this report, we tried to model long-distance interactions among PREs, with
the long-term goal of building predictive models for proximity and interactions of
chromosomal domains. Within the model, chromosomes were assigned a Rabl
configuration in the nucleus, a situation which is present in Drosophila embryonic
nuclei. We calculated the expected distance distribution of the two loci in question
and compared this distribution to experimental results that were obtained previously.



2. The model 

In our model the two arms of the chromosomes carrying the gene loci are represented
by two polymer chains. These chains are built up of ellipsoidal monomer segments
with a given ratio between the half-axes (a detailed description of the physical
model used is given elsewhere). The chains have to be non-ideal, because the chromosomes
cannot penetrate one another. Hence we have to demand that no two
monomers can occupy the same space at the same time. To ensure that our model is
comparable to a Rabl configuration, where the DNA is "stretched" from the apical
pole of the cell nucleus to its basal pole (see Fig. 1), we need a force acting on the
polymers which hinders them from curling. The easiest way to do so is to fix the
chains between two walls. The distance between these walls should be large enough
to ensure the stretching, yet small enough to allow for movement of the chains. In
our model the ratio between chain length and wall-to-wall distance ranges from 1.5
to 5. Both chains have the same length, with the end-points fixed in a plane. The
upper and lower end-points are symmetrical (see Fig. 2).

Figure 1. Schematic representation of long-distance pairing between two loci X and Y in
interphase nuclei. Nuclei are represented in a Rabl configuration, in which centromeres are
assembled near the apical pole of the nucleus, whereas the telomeres point toward the basal
pole. The wavy lines represent the chromosomes projecting from centromeric heterochromatin
toward the basal pole of the nucleus. Colored dots represent independent loci that are distant
in a normal situation (A) but can come in close proximity upon integration of PRE-containing
elements (B). This phenomenon leads to enhanced silencing and is dependent on PcG proteins.

The distance between the fixed points on the first wall ($d_1$) is kept small, so that
compared to the chain length they are essentially at the same point. In our simulations
we varied $d_1$ between $\tfrac{1}{4}$ and $1$ monomer length.

For a free self-avoiding walk (SAW) some essential properties are well known. Denoting
the end-to-end vector by $\vec{r}_e$, the mean of its length by
$\bar{r}_e = \langle |\vec{r}_e| \rangle$ and the root-mean-square deviation of that length by
$R_E = \sqrt{\langle (|\vec{r}_e| - \bar{r}_e)^2 \rangle}$, we know that for long walks
(number of steps $N$ large) $R_E$ scales as:

$$R_E(N) \approx a N^{\nu} \qquad (1)$$

$\nu$ is called the Flory parameter, initially calculated by Flory to be $\tfrac{3}{5}$, which
is still a very good approximation. When looking at the end-to-end distance distribution
function $p(\vec{r}_e)$ for a free SAW, there are two different scaling laws, one for the region
where $|\vec{r}_e|$ is small and one for the region where it is large. For small values the
distribution follows a power law:
whereas for large values a stretched exponential damping factor dominates:

$$\lim_{r_e \to \infty} p(r_e) = e^{-(r_e/R_E)^{\delta}} \times f_1\!\left(\frac{r_e}{R_E}\right), \qquad (3)$$

where $r_e = |\vec{r}_e|$, $f_1$ is some polynomial function and $\delta$ denotes the
stretching exponent. For the free case, $\gamma$ is a universal parameter that depends
solely on the dimensionality $d$; for $d = 3$ an approximate value is known. The
distribution $p(r_e)$ depends on $r_e$ solely through the ratio $r_e/R_E$. Changes in the
geometry of the chain environment (e.g. by the introduction of wedges or walls) lead
to changes in the value of the parameter $\gamma$, but leave the overall scaling predictions
unaffected. In our study we are simulating two stretched chains with fixed end-points,
which might be considered as a single one as long as the fixpoint distance
$d_1$ is kept small. One of our goals is to determine the shape of the distance
distributions between arbitrary monomers under these highly restraining conditions.
Even for the distribution function of the distance $r_{ij}$ between two arbitrary
steps $i$ and $j$ of a free SAW, the form of the distance distribution differs
from the end-to-end case. Though we might expect that a section of a free SAW
would itself behave like a shorter free SAW, this is not the case. The loss of
entropy at the endpoints leads to a loss of the shape properties.
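
As a rough illustration of the scaling relation (1), the following Python sketch enumerates all self-avoiding walks of a few steps on a simple cubic lattice and extracts an effective exponent from the mean squared end-to-end distance. The lattice walk, the function names and the use of the mean squared distance (rather than the width $R_E$ defined above) are assumptions made for this example only; this is not the off-lattice chain of ellipsoidal segments used in the simulations reported here.

```python
import math

# Nearest-neighbour steps on the simple cubic lattice (illustrative choice).
STEPS = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]


def enumerate_saws(n_steps):
    """Return (number of walks, mean squared end-to-end distance)
    over all self-avoiding walks with n_steps steps starting at the origin."""
    count = 0
    sum_r2 = 0
    visited = {(0, 0, 0)}

    def extend(pos, remaining):
        nonlocal count, sum_r2
        if remaining == 0:
            count += 1
            sum_r2 += pos[0] ** 2 + pos[1] ** 2 + pos[2] ** 2
            return
        for dx, dy, dz in STEPS:
            nxt = (pos[0] + dx, pos[1] + dy, pos[2] + dz)
            if nxt in visited:  # self-avoidance: a site may be used only once
                continue
            visited.add(nxt)
            extend(nxt, remaining - 1)
            visited.remove(nxt)

    extend((0, 0, 0), n_steps)
    return count, sum_r2 / count


def effective_exponent(n_a, n_b):
    """Effective exponent from <R^2> ~ N^(2 nu), using two walk lengths."""
    _, r2_a = enumerate_saws(n_a)
    _, r2_b = enumerate_saws(n_b)
    return 0.5 * math.log(r2_b / r2_a) / math.log(n_b / n_a)
```

For walks of two and three steps the exact enumeration gives mean squared distances of 2.4 and 3.88, i.e. an effective exponent of about 0.59. Exhaustive enumeration grows exponentially with the number of steps, so it is only practical for very short walks.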
Table 1. Parameter sets used in the simulations. All lengths are given in units of [a].

                   Segments in     z [a]    1st wall     2nd wall     obtained
                   one chain N              d_1 [a]      d_2 [a]      up to date

Parameter Set A        60           30       0.25         0.25         13,000
                       80           30       0.25         0.25         31,000
                      100           30       0.25         0.25         13,500
                      150           30       0.25         0.25         10,000

Parameter Set B        60           30       0.25         30           14,800
                       80           30       0.25         30           25,900
                      100           30       0.25         30           28,100
                      150           30       0.25         30            4,000

Parameter Set C        60           45       0.25         0.25         14,000
                       80           45       0.25         0.25         31,000
                      100           45       0.25         0.25         14,200
                      150           45       0.25         0.25         13,800

Parameter Set D        60           30       1            1            18,000
                       80           30       1            1             8,500
                      100           30       1            1            12,200
                      150           30       1            1             8,500

In our case, the chains are fixed between two walls and therefore a force is exerted
on them which restricts their freedom of movement. Additionally, we have to expect
further changes in the properties of the chains due to loss of entropy, as we fix the
positions of the end points of the walk precisely (the wall fixpoints), and the other
monomers have to stay in restricted areas (because both start and end points of the
walk are fixed). This means we are sampling over a distinct subset of the ensemble
of configurations of all free SAWs, which most probably has consequences for the
distributions $P_{n_1,n_2}(r_{n_1,n_2})$ of the distances between monomer $n_1$ on chain 1
and monomer $n_2$ on chain 2. Determining the shape of these distance distributions has
been an important aspect of our simulations. Using $\chi^2$-minimum fits we were able to
verify that for most combinations $(n_1, n_2)$, $P_{n_1,n_2}$ is Gaussian. However,
in those situations where the distribution had a small mean value, the shape clearly
deviated from the Gaussian form. In these cases our fits pointed towards a stretched
exponential for the upper end of the distribution, where the stretching exponent
differs from the one valid for the free end-to-end distance distribution (see
equation (3)). This observation will have to be verified on simulations with improved
statistics.
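
To make the kind of check described above concrete, the following Python sketch computes a $\chi^2$ value for a histogram of distances against a Gaussian whose mean and width are taken from the sample moments. This is a simplification of a full $\chi^2$-minimum fit, and the function names, bin edges and toy data are assumptions of this example. The toy data set with a small mean is the absolute value of a Gaussian variable, which only mimics the fact that distances cannot be negative; it is not simulation data.

```python
import math
import random


def gaussian_cdf(x, mu, sigma):
    """Cumulative distribution function of a normal distribution."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def chi2_against_gaussian(samples, edges, min_expected=5.0):
    """Return (chi2, dof) for a histogram of `samples` compared with a
    Gaussian whose mean and width are estimated from the samples."""
    n = len(samples)
    mu = sum(samples) / n
    sigma = math.sqrt(sum((s - mu) ** 2 for s in samples) / (n - 1))

    # Histogram the samples into the half-open bins [edges[k], edges[k+1]).
    counts = [0] * (len(edges) - 1)
    for s in samples:
        for k in range(len(counts)):
            if edges[k] <= s < edges[k + 1]:
                counts[k] += 1
                break

    chi2, used_bins = 0.0, 0
    for k, observed in enumerate(counts):
        expected = n * (gaussian_cdf(edges[k + 1], mu, sigma)
                        - gaussian_cdf(edges[k], mu, sigma))
        if expected >= min_expected:  # skip sparsely populated bins
            chi2 += (observed - expected) ** 2 / expected
            used_bins += 1
    # Approximate degrees of freedom: bins minus one normalisation
    # constraint minus the two estimated parameters (mu, sigma).
    return chi2, used_bins - 3


def demo(seed=1):
    """Compare a distribution far from zero with one that has a small mean."""
    rng = random.Random(seed)
    edges = [0.5 * k for k in range(25)]  # hypothetical bins from 0 to 12
    wide = [rng.gauss(6.0, 1.0) for _ in range(5000)]
    near_zero = [abs(rng.gauss(0.0, 1.0)) for _ in range(5000)]
    return chi2_against_gaussian(wide, edges), chi2_against_gaussian(near_zero, edges)
```

In this toy setting the distribution centred far from zero gives a $\chi^2$ of the order of the number of degrees of freedom, whereas the one with a small mean is strongly rejected as Gaussian.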



3. Comparison with experiment 

Our simulations will be compared with two different sets of experiments. In both
cases the two genetic loci in question were positioned on different chromosomes.
This makes our model applicable, as there is no direct connection between the
chromosomal arms we consider. If they were positioned on different arms of the same
chromosome, the centromere region might serve as a transmitter, so that any "tugging
motion" of one locus might be perceptible for the other one. Fixing both chains to
different fixpoints on the wall does not incorporate this feature. In both experiments,
copies of an element called Fab-7 (Fab-7 is an element containing a PRE
that regulates a homeotic gene of Drosophila melanogaster) have been inserted
into different loci X and Y:

Experiment I used Drosophila embryo cells, for which Fab-7 had been inserted at
locus X1 ("sd") and Y1 ("BX-C"). These loci are situated approximately
at one third from the top of the nucleus according to F. Bantignies.

Experiment II used Drosophila embryo cells, for which Fab-7 had been inserted
at locus X2 ("sd") and Y2 ("38F"). Locus X2 ("sd") is still approximately
one third from the top of the nucleus, but locus Y2 ("38F") is much closer
to it ($\approx$ one sixth to one eighth) according to F. Bantignies.

To make the distance distributions obtained through the simulations comparable
to these experimental results, we have to scale the data. We may assume that
the fixpoints on the walls in our simulation correspond to (loose) binding of the
chromosomes to the membrane of the nucleus, as outlined in Fig. 3. We neglected
the fixpoint distance at the top ($d_1$), as it is quite small compared to $z$. As we know
the diameter $D$ of the sphere representing the nucleus, we can use it to find an
approximate scaling relation. In fact, this is the largest scaling factor we may assume
that allows us to map the simulation setup into the inside of the nucleus. If we were to
consider the less likely scenario where the ends of our chromosomes are not fixed to the
nuclear membrane, but to some other place within the nucleus, our scaling factor would
have to be chosen smaller, as the distance between the start and end points of the chains,
which we hold fixed, would correspond to less than the diameter of the nucleus. Any
larger Kuhn segment length would place at least one of the fixpoints outside the
nucleus.

The length $c$ in Fig. 3 is most suitable for the derivation. In units of $D$ we have:

(4)

whereas in our arbitrary length scale, whose units we shall call [a], we have:

(5)

which, using $D \approx 5\,\mu\mathrm{m}$, leads to

Figure 3. Determination of length scale. Here $z$ is the wall-to-wall distance of our simulation
setup, $d_2$ is the distance between the fixpoints on one wall (the distance $d_1$ on the other
wall has been neglected as $\approx 0$). The angle $\alpha$ as well as the distances $a$, $b$ and
$c$ have been introduced solely for the sake of calculating the conversion factor from the
arbitrary length scale to micrometers and have no further meaning.

For the parameter sets A, C and D (see table 1), where $d_2$ is much smaller than the
wall-to-wall distance $z$, we may approximate equation (6) so that
$1\,[a] \approx \tfrac{5}{z}\,[\mu\mathrm{m}]$; in case B we have to use the complete formula.
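
The conversion from the arbitrary length unit [a] to micrometers can be sketched numerically. The Python example below assumes that the nucleus is a sphere of diameter $D \approx 5\,\mu\mathrm{m}$ passing through the (merged) top fixpoint and the two bottom fixpoints, so that its diameter in units of [a] is $z + d_2^2/(4z)$, and that one segment has length 1 [a]. This geometric reading and all function names are assumptions of this example, not a formula taken from the text; it does, however, reproduce the tabulated effective Kuhn segment lengths (170, 130 and 110 nm) to within rounding.

```python
def nucleus_diameter_in_a(z, d2):
    """Diameter, in units of [a], of the sphere through the top fixpoint and
    both bottom fixpoints (assumed geometry).

    The three points form an isosceles triangle with height z and base d2,
    whose circumscribed circle has radius (z^2 + (d2/2)^2) / (2 z).
    """
    return z + d2 ** 2 / (4.0 * z)


def a_in_micrometers(z, d2, nucleus_diameter_um=5.0):
    """Length of one unit [a] in micrometers for a nucleus of the given diameter."""
    return nucleus_diameter_um / nucleus_diameter_in_a(z, d2)


def kuhn_length_nm(z, d2, nucleus_diameter_um=5.0):
    """Effective Kuhn segment length in nm, assuming one segment is 1 [a] long."""
    return 1000.0 * a_in_micrometers(z, d2, nucleus_diameter_um)


def chain_length_um(n_segments, z, d2, nucleus_diameter_um=5.0):
    """Contour length in micrometers of a chain of n_segments segments."""
    return n_segments * a_in_micrometers(z, d2, nucleus_diameter_um)
```

For $d_2 \ll z$ the correction term $d_2^2/(4z)$ is negligible and the conversion reduces to $5/z$ micrometers per unit [a]; for parameter set B ($z = d_2 = 30$) the full expression is needed.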

The value of the Kuhn segment length for chromosomes is not yet well known,
though most biologists agree that it is of the order of a few hundred nm.
When we rescale our data according to equation (6), we also determine the
resultant Kuhn segment length and chain contour length (see table 2). The Kuhn segment
lengths are of the right order of magnitude. To allow an immediate comparison to
the experimental histograms, we also rebinned our simulation data
into the same ranges Bantignies et al. used in their data processing.
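
The rebinning step can be sketched as follows. The Python function below converts simulated distances from units of [a] to micrometers and sorts them into bins given by their lower edges, with everything at or beyond the last edge collected in an overflow bin. The function name, the overflow convention and the bin edges in the usage note are hypothetical; the actual ranges used in the experimental data processing are not reproduced here.

```python
from bisect import bisect_right


def rebin(distances_a, a_to_um, bin_edges_um):
    """Return the fraction of distances falling into each bin.

    distances_a  : simulated distances in units of [a]
    a_to_um      : conversion factor from [a] to micrometers
    bin_edges_um : ascending lower bin edges in micrometers; bin k covers
                   [edge_k, edge_{k+1}), the last bin is an overflow bin
    """
    if not distances_a:
        raise ValueError("no distances given")
    counts = [0] * len(bin_edges_um)
    for d in distances_a:
        d_um = d * a_to_um
        if d_um < bin_edges_um[0]:
            raise ValueError("distance below the first bin edge")
        # Index of the last edge that is <= d_um.
        k = bisect_right(bin_edges_um, d_um) - 1
        counts[k] += 1
    total = len(distances_a)
    return [c / total for c in counts]
```

With the made-up input `rebin([1, 2, 3, 6, 30], 0.5, [0.0, 1.0, 2.0])` the distances become 0.5, 1.0, 1.5, 3.0 and 15.0 micrometers, and the returned fractions are 0.2, 0.4 and 0.4.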

Table 2. Chain lengths and effective Kuhn segment lengths after rescaling.

                   Number of segments    Approximate chain     Resultant effective Kuhn
                   per chain N           length L [μm]         segment length l_K [nm]

Parameter Set A         60                    10                      170
                        80                    13                      170
                       100                    17                      170
                       150                    25                      170

Parameter Set B         60                     8                      130
                        80                    11                      130
                       100                    13                      130
                       150                    20                      130

Parameter Set C         60                     6                      110
                        80                     9                      110
                       100                    11                      110
                       150                    16                      110

Parameter Set D         60                    10                      170
                        80                    13                      170
                       100                    17                      170
                       150                    25                      170

We evaluated the distance distributions between point $n_1$ on chain 1 and point
$n_2$ on chain 2. In accordance with the position of the genes along their chromosomes,
for experiment I (sd and BX-C) we chose $n_1, n_2 \in [0.3N; 0.4N]$, where $n = 0$ would
correspond to the fixpoint on the top wall. For experiment II (sd and 38F) we
decided to consider ranges of $n_1 \in [0.1N; 0.2N]$ and $n_2 \in [0.3N; 0.4N]$. An example
of the resulting graphs for parameter set D can be seen in Figs. 4 and 5. We compared
the simulation data to the biological experimental data by visual judgement of these
graphs. When checking for possible agreement between the model and the
experiment, we found that the distance distributions in the groups with two copies of
the gene can most probably be identified with distributions from our simulation.
The congruence between the curves increased as the number of segments per chain
increased. We also found that a larger Kuhn segment length is favorable. In these
comparisons we always neglected discrepancies in the very first bin (smallest distances),
as the biological model predicts a short-range binding force keeping the genes together
once they come within range. This force has not yet been incorporated in the simulation
model. On the other hand, the graphs also give strong evidence that no parameter choice
within our simple model allows the control group distributions to be identified with
the simulation, because the full width at half maximum (FWHM) of the experimental
curves is large although the mean value is relatively small. Parameter choices that
could produce a comparable FWHM in the simulation would always place the mean value
of the distribution at a much larger value.
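
The two quantities compared in this argument, the mean and the FWHM of a binned distribution, can be extracted as in the following Python sketch. The function name and the linear interpolation between bin centres are choices made for this example and are not necessarily how the curves were judged here, where the comparison was visual. For a Gaussian of width $\sigma$ the FWHM is $2\sqrt{2\ln 2}\,\sigma \approx 2.355\,\sigma$, which serves as a check.

```python
def mean_and_fwhm(centers, heights):
    """Return (mean, FWHM) of a single-peaked binned distribution.

    centers : ascending bin centres
    heights : bin contents (any normalisation)
    The half-maximum crossings on both sides of the peak are located by
    linear interpolation between neighbouring bin centres.
    """
    total = sum(heights)
    mean = sum(c * h for c, h in zip(centers, heights)) / total

    peak = max(heights)
    half = peak / 2.0
    i_peak = heights.index(peak)

    # Search to the left of the peak; fall back to the first bin centre.
    left = centers[0]
    for i in range(i_peak, 0, -1):
        if heights[i - 1] < half <= heights[i]:
            t = (half - heights[i - 1]) / (heights[i] - heights[i - 1])
            left = centers[i - 1] + t * (centers[i] - centers[i - 1])
            break

    # Search to the right of the peak; fall back to the last bin centre.
    right = centers[-1]
    for i in range(i_peak, len(heights) - 1):
        if heights[i + 1] < half <= heights[i]:
            t = (heights[i] - half) / (heights[i] - heights[i + 1])
            right = centers[i] + t * (centers[i + 1] - centers[i])
            break

    return mean, right - left
```

For coarse histograms with only a few bins, as in the experimental data, the interpolated FWHM is only a rough estimate.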

4. Conclusions 

The simple "two chains, no interaction" (except self-avoidance) model employed in
the simulations is most likely adequate to account for the observed distance
distributions when two copies of Fab-7 are present (the "pairing" case). We were
able to distinguish two trends: those simulations that had the most segments were
best suited to "reproduce" the experimental curve, as were setups with increasing
Kuhn segment length. Using this information we have a good idea in which areas of
the parameter space further simulations should be conducted to enable a numerical
comparison between experimental and simulation data, with the aim of determining
the strength and range of the pairing/binding force between the two copies of Fab-7.

Figure 4. Distance distributions that involve the sd and the BX-C loci.
The four bars on the left-hand side in each group represent different chain lengths of 60, 80,
100, and 150 Kuhn segment lengths, respectively. In the case shown the Kuhn segment length
was 170 nm. The next bar (turquoise) gives the experimental results when two copies of Fab-7
are present; the black and grey bars stand for the control groups with only one copy.

The failure of the model with regard to the control groups is rooted in its sim- 
plicity. Possibly there is a repulsive force acting that is keeping different genes in 
different compartments of the nucleus. The simulation data we already have to- 
gether with further biological data from Bantignies et al. concerning the distances 
not only between the two genetic loci, but also between other places along the 
chromosomes in question should enable us to improve our model by introducing 
adequate attractive and/or repulsive forces. 

An open question concerns the shape of the distance distribution function in
different regimes. Whereas in some cases $\chi^2$-minimum fits show that the
distribution is clearly Gaussian, in others notable deviations were found, especially in
distributions with small mean distances. Though we found evidence that the upper
end of these might behave as a stretched exponential, further investigations with
improved statistics will have to be conducted to solidify this result.

Figure 5. Distance distributions that involve the sd and 38F loci. The legend is identical
to that of Fig. 4.